Graphical Models for Integrated Intelligent Robot Architectures
Abstract
The theoretically elegant yet broadly functional capability of graphical models shows intriguing potential to span perception, cognition and action in a uniform manner, and thus ultimately to yield simpler yet more powerful integrated architectures for intelligent robots and other comparable systems. This position paper explores this potential, with initial support from an effort underway to develop a graphical architecture that is based on factor graphs (with piecewise continuous functions).

Robots require a close coupling of (multiple forms of) perception and action. Intelligent robots go beyond this to require a further coupling with cognition. From the perspective of robotics – with its focus on behavior in the world – the construction of intelligent robots generally emphasizes a tightly integrated perceptuomotor system that is then loosely connected to some limited form of cognitive system (such as a planner); see, for example, (Bonasso et al. 1997). From the perspective of cognitive architectures – with their focus on integrated embodiments of hypotheses concerning the fixed structure underlying intelligent behavior – the construction of intelligent robots generally emphasizes a highly functional cognitive system that is then loosely connected to limited perceptual and motor modules; see, for example, (Laird and Rosenbloom 1990). Neither perspective typically strives for a deep integration across the signal-to-symbol divide that separates the perceptuomotor and cognitive systems, nor even to do full justice to what is on the other side.

Other approaches are possible though. One such is a form of graphical architecture that leverages the broadly functional yet theoretically elegant construct of graphical models (Koller and Friedman 2009) to support, among other things, a uniform approach to signal and symbol processing.

Copyright © 2011, Association for the Advancement of Artificial Intelligence (www.aaai.org). All rights reserved.
At their core, graphical models concern efficient computation over complex multivariate functions by decomposing them into the product of simpler subfunctions. Such models have over the years produced state-of-the-art capabilities across many aspects of perception and robotics under a variety of names, such as factor graphs, Markov and conditional random fields, Bayesian and Markov networks, hidden Markov models, Kalman filters, and the Viterbi algorithm. Less commonly recognized is that they can also yield state-of-the-art symbol processing in areas such as constraint satisfaction (Dechter 2003) and rule match (Rosenbloom 2011a); and are even compatible with full first-order logic (Domingos and Lowd 2009). The key question this raises though is whether all of the requisite capabilities can be combined into a unified architecture that is based on a single form of graphical model capable of exhibiting a superset of the requisite functionality.

The position staked out here is that graphical models do provide a potential basis for the development of integrated intelligent robot architectures that uniformly span the necessary capabilities across perception, cognition and action. Over the past few years I have been exploring a particular graphical architecture that leverages factor graphs (Kschischang, Frey and Loeliger 2001) for simpler yet more general combinations of capabilities resembling those found in existing cognitive architectures, while also incorporating additional capabilities that are beyond them (Rosenbloom 2011a, 2011b). Although factor graphs were originally developed in coding theory, they are the most general form of graphical model so far developed, and show promise for broad applicability across AI and robotics. Many of the mechanisms in existing cognitive architectures, while not normally cast as factor graphs, are in fact amenable to reconceptualization in this form; as are many forms of processing central to robotics.
This graphical architecture is still in a relatively early stage of development, but it has already been shown to produce forms of many of the major capabilities found in state-of-the-art cognitive architectures, such as Soar (Laird 2012) and ACT-R (Anderson 2009). This has included both procedural (rule) and declarative (semantic and episodic) memories, plus a constraint memory that combines procedural and declarative aspects (Rosenbloom 2010). Beginnings of mental imagery have also been demonstrated (Rosenbloom 2011c), along with preference-based decision making and the ability to reflect upon impasses in decision making (Rosenbloom 2011d). Simple forms of statistical language processing have also been produced – including word sense disambiguation and question answering – as have the beginnings of learning.

Even more interesting from the perspective of intelligent robotics is that forms of perception (based on a CRF), localization (based on the relevant portion of SLAM), and probabilistic decision making (based on POMDPs) have also been demonstrated; and, in fact, integrated together into a single factor graph that performs a simple form of navigation in a virtual corridor (Chen et al. 2011). Although it still has far to go, this work represents an important step in developing the perception and robotics capabilities that will be needed to ground the architecture in the world.

To support the initial plausibility of this paper’s position, the next two sections will explain in a bit more detail how the existing graphical architecture works, along with the structure of the combined factor graph built for perception, localization and decision making in virtual corridor navigation. The last section then wraps up with some final discussion.
The Graphical Architecture

The graphical architecture is based on piecewise continuous functions, to provide a broad-spectrum representational primitive, plus factor graphs that are defined over these primitives to provide more complex representations and the processing needed over them. Together these two techniques enable signals and symbols, along with uncertainty about them, to be represented and processed in a uniform manner.

Piecewise continuous functions partition a multidimensional space into regions, and then specify a continuous function – which is limited to linear in the current architecture – over each region (Figure 1). Such a representation can represent arbitrary continuous functions as closely as desired, albeit at the cost of more regions. It can also represent discrete functions by introducing region boundaries at unit intervals and limiting the functions on these regions to constant values; and represent symbols by further limiting the constant values to 0/1 (for false/true) and adding a symbol table.

Factor graphs are a flavor of undirected graphical model that replaces the potential functions from Markov networks (aka Markov random fields) with factor nodes in the graph itself. Figure 2, for example, shows a simple factor graph for a polynomial function, with three variable nodes and two factor nodes. Factor graphs can be solved via a range of approaches – including message passing, sampling, particle filters, and variational methods – to yield either marginals on variables or a MAP estimate. Some of the methods are exact, at least for some graphs, while others are approximate. The graphical architecture is based on a variant of the summary product algorithm, a message passing approach in which a message in either direction along a link expresses a function over the variable(s) on the link.
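The piecewise representation described in this section can be illustrated with a minimal one-dimensional sketch. It is not the architecture's implementation: the class `PiecewiseLinear1D`, the helpers `approximate` and `discrete`, and the numeric codes in `symbol_table` are hypothetical names chosen for illustration, and the real architecture works over multidimensional regions.

```python
from bisect import bisect_right
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class PiecewiseLinear1D:
    """On [bounds[i], bounds[i+1]) the value is slopes[i] * x + intercepts[i].

    Outside all regions the value is taken to be 0 (an assumption of this sketch).
    """
    bounds: List[float]      # region boundaries, strictly increasing
    slopes: List[float]      # one slope per region
    intercepts: List[float]  # one intercept per region

    def __call__(self, x: float) -> float:
        if x < self.bounds[0] or x >= self.bounds[-1]:
            return 0.0
        i = bisect_right(self.bounds, x) - 1  # index of the region containing x
        return self.slopes[i] * x + self.intercepts[i]


def approximate(f: Callable[[float], float], lo: float, hi: float,
                n_regions: int) -> PiecewiseLinear1D:
    """Approximate a continuous f on [lo, hi) by interpolating over equal-width regions."""
    bounds = [lo + (hi - lo) * k / n_regions for k in range(n_regions + 1)]
    slopes, intercepts = [], []
    for a, b in zip(bounds, bounds[1:]):
        m = (f(b) - f(a)) / (b - a)
        slopes.append(m)
        intercepts.append(f(a) - m * a)
    return PiecewiseLinear1D(bounds, slopes, intercepts)


def discrete(values: List[float]) -> PiecewiseLinear1D:
    """Discrete function on 0..len(values)-1: unit-width regions with constant values."""
    n = len(values)
    return PiecewiseLinear1D([float(k) for k in range(n + 1)], [0.0] * n, list(values))


def max_error(approx: PiecewiseLinear1D, f: Callable[[float], float],
              lo: float, hi: float, samples: int = 200) -> float:
    """Largest absolute deviation from f on an evenly spaced sample of [lo, hi)."""
    xs = [lo + (hi - lo) * k / samples for k in range(samples)]
    return max(abs(approx(x) - f(x)) for x in xs)


def square(x: float) -> float:
    return x * x


# More regions give a closer approximation of a continuous function.
coarse = approximate(square, 0.0, 2.0, 2)
fine = approximate(square, 0.0, 2.0, 16)

# Symbols: constant values restricted to 0/1, plus a symbol table (codes are hypothetical).
symbol_table = {"Walker": 0, "Table": 1}
is_table = discrete([0.0, 1.0])
```

Comparing `max_error(coarse, square, 0.0, 2.0)` with `max_error(fine, square, 0.0, 2.0)` shows the trade-off stated above: accuracy improves at the cost of more regions.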
At a variable node, messages are combined via pointwise product, a computation that is similar to an inner product except that there is no final summation and the input and output functions thus have the same rank. At a factor node, a similar product is computed, but the node’s function is also included in the product, and then all variables not to be included in the outgoing message are summarized out. Summarization may occur via integration – since these are continuous functions – to yield marginals, or via maximization to yield MAP. Figure 3, for example, shows how the algorithm generates a marginal for the variable y if evidence is provided for variables x (=5) and z (=3).

In the architecture, factor graphs are specified in a high-level language of conditionals, with each conditional consisting of some combination of conditions, actions, condacts, and a function. Conditions and actions are much like the same structures in rule-based systems; conditions match to patterns of elements in working memory and pass matches forward, while actions take information about matches and convert them into changes in working memory. Figure 4, for example, shows a heuristic rule from the Eight Puzzle. Condacts, a neologism for conditions and actions, meld these two functionalities. When combined with Boolean functions over their variables, condacts enable the bidirectional processing that is key to constraint satisfaction. With numerical functions, this same bidirectional activity is central to signal processing and probabilistic reasoning. Figure 5, for example, shows a conditional that defines the conditional probability of an object’s weight given its category.

Figure 1: 2D piecewise continuous function as an array of linear regions.

Figure 2: Factor graph for $f(x,y,z) = y^2 + yz + 2yx + 2xz = (2x+y)(y+z) = f_1(x,y)\,f_2(y,z)$.
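The message computations described above can be sketched on the two-factor graph of Figure 2, using the factors (2x+y) and (y+z) with evidence x=5 and z=3. This is only an illustrative discretization, not the architecture's variant of the algorithm: the integer domain `DOMAIN`, the dictionary-based `Message` type, and all function names are assumptions of the sketch, and summation stands in for integration.

```python
from typing import Callable, Dict, Iterable

Message = Dict[int, float]  # a function over one discrete variable

DOMAIN = range(10)  # assumed integer domain shared by x, y and z


def f1(x: int, y: int) -> float:
    """First factor of the polynomial: 2x + y."""
    return 2 * x + y


def f2(y: int, z: int) -> float:
    """Second factor of the polynomial: y + z."""
    return y + z


def evidence(value: int) -> Message:
    """Evidence message: 1 at the observed value, 0 elsewhere."""
    return {v: 1.0 if v == value else 0.0 for v in DOMAIN}


def uniform() -> Message:
    """Message carrying no evidence."""
    return {v: 1.0 for v in DOMAIN}


def factor_message(factor: Callable[[int, int], float], incoming: Message,
                   summarize: Callable[[Iterable[float]], float] = sum) -> Message:
    """Factor node: multiply the factor by the incoming message, then summarize
    out the other variable. `factor` is called as factor(other, y)."""
    return {y: summarize(factor(o, y) * incoming[o] for o in DOMAIN) for y in DOMAIN}


def pointwise_product(a: Message, b: Message) -> Message:
    """Variable node: combine incoming messages value by value (no summation)."""
    return {v: a[v] * b[v] for v in a}


def marginal_on_y(msg_x: Message, msg_z: Message,
                  summarize: Callable[[Iterable[float]], float] = sum) -> Message:
    """Unnormalized function over y obtained from messages arriving via f1 and f2."""
    from_f1 = factor_message(f1, msg_x, summarize)
    from_f2 = factor_message(lambda z, y: f2(y, z), msg_z, summarize)
    return pointwise_product(from_f1, from_f2)


# Evidence x=5 and z=3: the result over y equals (10 + y) * (y + 3).
marginal_y = marginal_on_y(evidence(5), evidence(3))

# Without evidence, the factored computation matches brute-force summation
# over x and z, while touching far fewer terms.
factored = marginal_on_y(uniform(), uniform())
brute_force = {y: sum(f1(x, y) * f2(y, z) for x in DOMAIN for z in DOMAIN)
               for y in DOMAIN}
```

Passing `summarize=max` instead of the default `sum` switches the factor nodes from marginalization to maximization, the second form of summarization mentioned in the text.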
The architecture’s central processing cycle consists of starting with a pool of evidence stored in working-memory factor nodes, passing messages until quiescence, and then deciding what changes to make in the working-memory factor nodes. Each such cycle can include parallel waves of rule firings, access to declarative knowledge, perception, and simple forms of reasoning.

Virtual Corridor Navigation

The virtual corridor navigation task concerns moving from an initial location to a goal location in a one-dimensional virtual corridor, given: a utility function on locations; noisy perception that is limited to detecting rectangles, circles and colors in the current location; and a fallible ability to move to adjacent locations (Figure 6). The walls at the end of the corridor appear as rectangles, while the three doors appear as colored rectangles with circles (doorknobs).

Processing during each decision starts with perception via a conditional random field (CRF). Training was performed outside of the architecture, since the appropriate architectural learning mechanisms have not yet been developed, but the results were then added as conditionals to the architecture’s knowledge base, yielding a graph like that in Figure 7 for three time steps. For each such time step, sensations (S) from three (virtual) sensors arrive at the bottom and are passed through perceptual factor nodes (P) to yield distributions over the object seen (O). Object-transition factor nodes (OT) further refine these distributions by imposing probabilistic constraints on what can be seen on adjacent time steps.

The object distributions (O) feed into the localization network (Figure 8), whose task is to determine distributions over the locations (X), via map factor nodes (M) that relate objects to locations, a prior distribution on the initial location (Pr), and location-transition factor nodes (XT) that probabilistically constrain the locations for adjacent time steps.
The latter also takes into consideration

Figure 3: Computation via the summary product algorithm of the marginal on y from evidence on x and z.

    CONDITIONAL GoalReject
      Conditions: (Operator id:o state:s x:x y:y)
                  (Goal state:s x:x y:y tile:t)
                  (Board state:s x:x y:y tile:t)
      Actions:    (Selected state:s operator:o)

Figure 4: Eight Puzzle heuristic that rejects from consideration operators that move tiles out of place.

    CONDITIONAL ConceptWeight
      Conditions: (Object state:s object:o)
      Condacts:   (Concept object:o concept:c)
                  (Weight object:o weight:w)

      w\c        Walker     Table     ...
      [1,10>     .01w       .001w     ...
      [10,20>    .2-.01w    “         ...
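The localization network described in the Virtual Corridor Navigation section combines a prior (Pr), map factors (M) and location-transition factors (XT) with the object distributions (O) produced by perception. The following is a rough sketch of that combination as forward filtering over a short corridor. It is not the paper's factor graph: the corridor layout `CORRIDOR`, the probabilities `P_MAP` and `P_MOVE`, the always-move-right action, and the omission of colors are all assumptions made for illustration.

```python
from typing import Dict, List

# Hypothetical corridor: walls at both ends, two doors; colors are ignored here.
CORRIDOR = ["wall", "none", "door", "none", "none", "door", "wall"]
OBJECTS = ["wall", "door", "none"]
P_MAP = 0.9   # assumed probability that the object at a location is the one seen
P_MOVE = 0.8  # assumed probability that a move to the right succeeds


def normalize(belief: List[float]) -> List[float]:
    total = sum(belief)
    return [b / total for b in belief]


def map_factor(obj: str, loc: int) -> float:
    """M: how strongly seeing `obj` is associated with location `loc`."""
    if CORRIDOR[loc] == obj:
        return P_MAP
    return (1.0 - P_MAP) / (len(OBJECTS) - 1)


def transition(belief: List[float]) -> List[float]:
    """XT for a fallible move right: sum out the previous location."""
    n = len(belief)
    new_belief = [0.0] * n
    for prev, p in enumerate(belief):
        nxt = min(prev + 1, n - 1)               # cannot move past the end wall
        new_belief[nxt] += P_MOVE * p            # move succeeded
        new_belief[prev] += (1.0 - P_MOVE) * p   # move failed, stayed in place
    return new_belief


def observe(belief: List[float], obj_dist: Dict[str, float]) -> List[float]:
    """Combine the belief with a distribution over the object seen (O) via M."""
    likelihood = [sum(map_factor(o, x) * obj_dist[o] for o in OBJECTS)
                  for x in range(len(belief))]
    return normalize([b * l for b, l in zip(belief, likelihood)])


def localize(object_dists: List[Dict[str, float]]) -> List[float]:
    """Distribution over the current location after a sequence of time steps."""
    belief = [1.0 / len(CORRIDOR)] * len(CORRIDOR)  # Pr: uniform prior (assumed)
    belief = observe(belief, object_dists[0])
    for obj_dist in object_dists[1:]:
        belief = observe(transition(belief), obj_dist)
    return belief


# Three time steps of noisy perception: probably a wall, then nothing, then a door.
perceived = [
    {"wall": 0.8, "door": 0.1, "none": 0.1},
    {"wall": 0.1, "door": 0.1, "none": 0.8},
    {"wall": 0.1, "door": 0.8, "none": 0.1},
]
final_belief = localize(perceived)
most_likely = max(range(len(final_belief)), key=final_belief.__getitem__)
```

Under these assumptions the belief concentrates on the first door, since that is the location most consistent with seeing a wall, then an empty stretch, then a door while moving right.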
Publication date: 2012